Added DataFrame::autoCast() #923
Flow PHP - Benchmarks
Results of the benchmarks from this PR are compared with the results from the 1.x branch.
Extractors
+-----------------------+-------------------+------+-----+------------------+------------------+----------------+
| benchmark | subject | revs | its | mem_peak | mode | rstdev |
+-----------------------+-------------------+------+-----+------------------+------------------+----------------+
| AvroExtractorBench | bench_extract_10k | 1 | 3 | 35.133mb +0.00% | 714.872ms +1.64% | ±1.07% -27.13% |
| CSVExtractorBench | bench_extract_10k | 1 | 3 | 4.818mb +0.21% | 329.707ms +4.00% | ±0.43% +11.91% |
| JsonExtractorBench | bench_extract_10k | 1 | 3 | 4.883mb +0.20% | 937.695ms -0.41% | ±0.84% +45.44% |
| ParquetExtractorBench | bench_extract_10k | 1 | 3 | 239.587mb +0.00% | 1.149s +1.93% | ±0.59% +9.67% |
| TextExtractorBench | bench_extract_10k | 1 | 3 | 4.667mb -0.08% | 32.399ms +5.41% | ±0.52% +2.19% |
| XmlExtractorBench | bench_extract_10k | 1 | 3 | 4.668mb -0.08% | 423.638ms -0.41% | ±0.38% -72.75% |
+-----------------------+-------------------+------+-----+------------------+------------------+----------------+
Transformers
+-----------------------------+--------------------------+------+-----+------------------+-----------------+----------------+
| benchmark | subject | revs | its | mem_peak | mode | rstdev |
+-----------------------------+--------------------------+------+-----+------------------+-----------------+----------------+
| RenameEntryTransformerBench | bench_transform_10k_rows | 1 | 3 | 110.381mb -0.00% | 64.083ms -0.24% | ±0.20% -80.81% |
+-----------------------------+--------------------------+------+-----+------------------+-----------------+----------------+
Loaders
+--------------------+----------------+------+-----+------------------+------------------+----------------+
| benchmark | subject | revs | its | mem_peak | mode | rstdev |
+--------------------+----------------+------+-----+------------------+------------------+----------------+
| AvroLoaderBench | bench_load_10k | 1 | 3 | 94.745mb +0.01% | 456.951ms -0.36% | ±0.51% -53.96% |
| CSVLoaderBench | bench_load_10k | 1 | 3 | 54.852mb +0.02% | 71.610ms -0.85% | ±2.11% +28.07% |
| JsonLoaderBench | bench_load_10k | 1 | 3 | 105.338mb +0.01% | 55.212ms -0.85% | ±0.80% -47.32% |
| ParquetLoaderBench | bench_load_10k | 1 | 3 | 320.549mb +0.00% | 1.256s -1.62% | ±0.20% -71.95% |
| TextLoaderBench | bench_load_10k | 1 | 3 | 17.709mb +0.06% | 39.305ms -4.50% | ±0.67% +97.92% |
+--------------------+----------------+------+-----+------------------+------------------+----------------+
Building Blocks
+-------------------------+----------------------------+------+-----+------------------+------------------+------------------+
| benchmark | subject | revs | its | mem_peak | mode | rstdev |
+-------------------------+----------------------------+------+-----+------------------+------------------+------------------+
| NativeEntryFactoryBench | bench_entry_factory | 1 | 3 | 115.996mb +0.01% | 403.944ms +4.47% | ±1.53% -24.86% |
| NativeEntryFactoryBench | bench_entry_factory | 1 | 3 | 59.715mb +0.02% | 199.745ms +3.00% | ±1.40% +54.06% |
| NativeEntryFactoryBench | bench_entry_factory | 1 | 3 | 14.840mb +0.07% | 42.036ms +0.76% | ±1.84% +86.20% |
| TypeDetectorBench | bench_type_detector | 1 | 3 | 59.402mb +0.00% | 326.987ms -1.47% | ±0.44% +126.30% |
| TypeDetectorBench | bench_type_detector | 1 | 3 | 14.325mb +0.01% | 64.595ms -0.99% | ±0.39% +8.22% |
| RowsBench | bench_chunk_10_on_10k | 2 | 3 | 76.461mb -0.00% | 3.230ms -7.02% | ±1.71% -11.53% |
| RowsBench | bench_diff_left_1k_on_10k | 2 | 3 | 96.254mb -0.00% | 181.356ms -1.08% | ±0.30% -67.32% |
| RowsBench | bench_diff_right_1k_on_10k | 2 | 3 | 74.779mb -0.00% | 18.049ms -3.10% | ±0.42% -83.78% |
| RowsBench | bench_drop_1k_on_10k | 2 | 3 | 77.701mb -0.00% | 1.683ms -11.92% | ±1.25% -60.97% |
| RowsBench | bench_drop_right_1k_on_10k | 2 | 3 | 77.701mb -0.00% | 1.657ms -16.99% | ±0.85% +42.65% |
| RowsBench | bench_entries_on_10k | 2 | 3 | 74.813mb -0.00% | 2.479ms -12.19% | ±1.41% +157.41% |
| RowsBench | bench_filter_on_10k | 2 | 3 | 75.342mb -0.00% | 14.433ms -1.91% | ±1.35% +148.49% |
| RowsBench | bench_find_on_10k | 2 | 3 | 75.342mb -0.00% | 14.640ms +0.90% | ±1.87% -43.84% |
| RowsBench | bench_find_one_on_10k | 10 | 3 | 73.245mb -0.01% | 1.794μs -5.88% | ±2.67% +9.43% |
| RowsBench | bench_first_on_10k | 10 | 3 | 73.245mb -0.01% | 0.400μs 0.00% | ±0.00% 0.00% |
| RowsBench | bench_flat_map_on_1k | 2 | 3 | 86.801mb -0.00% | 12.644ms -4.50% | ±2.52% +55.44% |
| RowsBench | bench_map_on_10k | 2 | 3 | 116.161mb -0.00% | 63.199ms +0.56% | ±1.26% +114.94% |
| RowsBench | bench_merge_1k_on_10k | 2 | 3 | 75.862mb -0.00% | 1.218ms -19.66% | ±0.49% -78.32% |
| RowsBench | bench_partition_by_on_10k | 2 | 3 | 79.209mb -0.00% | 58.616ms -0.82% | ±0.98% -63.20% |
| RowsBench | bench_remove_on_10k | 2 | 3 | 77.963mb -0.00% | 3.790ms -10.77% | ±0.38% -79.79% |
| RowsBench | bench_sort_asc_on_1k | 2 | 3 | 73.390mb -0.01% | 40.456ms +1.15% | ±1.34% -33.01% |
| RowsBench | bench_sort_by_on_1k | 2 | 3 | 73.391mb -0.01% | 39.660ms -2.54% | ±2.04% +54.65% |
| RowsBench | bench_sort_desc_on_1k | 2 | 3 | 73.390mb -0.01% | 40.552ms +0.13% | ±0.63% -46.92% |
| RowsBench | bench_sort_entries_on_1k | 2 | 3 | 75.687mb -0.00% | 7.323ms -0.13% | ±0.33% -80.75% |
| RowsBench | bench_sort_on_1k | 2 | 3 | 73.245mb -0.01% | 29.588ms -3.51% | ±1.33% -55.53% |
| RowsBench | bench_take_1k_on_10k | 10 | 3 | 73.245mb -0.01% | 13.466μs -2.83% | ±1.52% -14.99% |
| RowsBench | bench_take_right_1k_on_10k | 10 | 3 | 73.245mb -0.01% | 15.600μs -5.12% | ±0.52% -71.02% |
| RowsBench | bench_unique_on_1k | 2 | 3 | 96.255mb -0.00% | 186.812ms +1.70% | ±0.20% +4787.86% |
+-------------------------+----------------------------+------+-----+------------------+------------------+------------------+
    {
        return $rows->map(function (Row $row) {
            return $row->map(function (Entry $entry) {
                return $this->autoCast($entry);
We might later want to add a caching layer here that remembers the type detected for each entry after the first cast, so we avoid running the same detection process over and over.
Of course, not all rows are guaranteed to have the same type, but in that case we can catch the exception and fall back to the autoCast function.
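A rough sketch of what such a caching layer could look like. It is deliberately self-contained and does not use Flow's Row, Entry or type detector classes: `detectType()`, `castAs()` and `CachingAutoCaster` are hypothetical names made up for illustration, and the detection rules are a simplified stand-in for the real ones.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for the expensive detection step (not Flow's detector).
function detectType(string $value): string
{
    if (preg_match('/^-?\d+$/', $value) === 1) {
        return 'integer';
    }
    if (is_numeric($value)) {
        return 'float';
    }
    if (in_array(strtolower($value), ['true', 'false'], true)) {
        return 'boolean';
    }

    return 'string';
}

// Hypothetical cast that throws when the value does not fit the requested type.
function castAs(string $type, string $value): int|float|bool|string
{
    switch ($type) {
        case 'integer':
            if (preg_match('/^-?\d+$/', $value) !== 1) {
                throw new InvalidArgumentException("Not an integer: {$value}");
            }
            return (int) $value;
        case 'float':
            if (!is_numeric($value)) {
                throw new InvalidArgumentException("Not a float: {$value}");
            }
            return (float) $value;
        case 'boolean':
            $lower = strtolower($value);
            if (!in_array($lower, ['true', 'false'], true)) {
                throw new InvalidArgumentException("Not a boolean: {$value}");
            }
            return $lower === 'true';
        default:
            return $value;
    }
}

final class CachingAutoCaster
{
    /** @var array<string, string> type remembered per entry name */
    private array $types = [];

    /** Number of times full detection ran; exposed only to make the effect visible. */
    public int $detections = 0;

    public function cast(string $entryName, string $value): int|float|bool|string
    {
        $cached = $this->types[$entryName] ?? null;

        if ($cached !== null) {
            try {
                // Fast path: reuse the type seen first for this entry name.
                return castAs($cached, $value);
            } catch (InvalidArgumentException) {
                // The value does not fit the cached type; fall back to detection.
            }
        }

        $this->detections++;
        $type = detectType($value);
        // Only the first detected type is remembered for an entry name.
        $this->types[$entryName] ??= $type;

        return castAs($type, $value);
    }
}

$caster = new CachingAutoCaster();
$caster->cast('id', '1');   // runs detection and remembers 'integer' for "id"
$caster->cast('id', '2');   // served from the cache, no detection
$caster->cast('id', 'abc'); // cached cast throws, falls back to detection
```

One caveat of this particular sketch: if the first value seen for an entry is detected as a string, every later value for that entry is returned as a string, because a string cast never fails. A real implementation would probably need a rule for when to invalidate or widen the remembered type.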
Change Log
Added
Fixed
Changed
Removed
Deprecated
Security
Closes: #921
Description